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ABSTRACT 


With the increasing importance of multiple platform/multiple remote sensing missions, fast and 
automatic integration of digital data from disparate sources has become critical to the success of 
these endeavors. Our work utilizes maxima of wavelet coefficients to form the basic features of a 
correlation-based automatic registration algorithm. Our wavelet-based registration algorithm is 
tested successfully with data from the NOAA Advanced Very High Resolution Radiometer 
(AVHRR) and the Landsat/Thematic Mapper (TM), which differ by translation and/or rotation. By 
the choice of high-frequency wavelet features, this method is similar to an edge-based correlation 
method, but by exploiting the multi-resolution nature of a wavelet decomposition, our method 
achieves higher computational speeds for comparable accuracies. This algorithm has been 
implemented on a Single Instruction Multiple Data (SIMD) massively parallel computer, the 
MasPar MP-2, as well as on the CrayT3D, the Cray T3E and a Beowulf cluster of Pentium 

workstations. 
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1. REGISTRATION OF SATELLITE REMOTE SENSING IMAGERY 

1.1 Why Geo-Registration 

Digital image registration is very important in many applications of image processing, such as 
medical imagery, robotics, visual inspection, and remotely sensed data processing [1,2]. For all of 
these applications, image registration is defined as the process which determines the most accurate 
match between two or more images acquired at the same or at different times by different or 
identical sensors. Registration provides the “relative” orientation of two images (or one image and 
other sources, e.g., a map), with respect to each other, from which the absolute orientation into an 
absolute reference system can be derived. Image registration is usually motivated by such goals as 
object recognition, model matching, pose estimation, or change detection. In this paper, we will only 
refer to remote sensing applications, for which automated image geo-registration has become a 
highly desirable technique. 

In the near future, satellite remote sensing systems will provide large amounts of global 
coverage and repetitive measurements representing multiple-time or simultaneous observations of 
the same features by different sensors. Also, with the new trend of smaller missions, most sensors 
will be carried on separate platforms, resulting in a tremendous amount of data that must be 
combined. In meeting some of the NASA/Earth Science Enterprise objectives [2], the combination 
of all these data at various resolutions - spatial, radiometric and temporal - will allow a better 
understanding of Earth and space science phenomena. For example, for land cover applications, the 
combination of coarse-resolution viewing systems for large area surveys and finer resolution 
sensors for more detailed studies offer the multilevel information necessary to accurately assess the 
areal extent of important land transformations. High-resolution sensors, such as Landsat or SPOT 
are very good for monitoring vegetation changes, e.g., changes in forest cover [3-7], when 
landscape features are local in scale. However, studies at a global or continental scale at high spatial 
and temporal resolutions would require the processing of very large volumes of data. It is therefore 
necessary to combine information from both types of sensors to conduct feasible, accurate studies. 

Summarizing the different applications of geo-registration, the different categories defined by 
[1] can be described in terms of remote sensing applications and include: 

(1) data fusion, new sensor calibration [8], or multimodal registration which enables the integration 
of complementary information from different sensors as well as super-resolution [9]; 

(2) change detection [10,1 1], and Earth resources surveying, or temporal registration performed to 
monitor and measure agricultural, geological or land cover features extracted from data obtained 
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from one or several sensors over a period of time. Cloud removal is another application of temporal 
registration, when observations over several days can be fused to create cloud-free data; 

(3) landmark navigation, formation flying and planet exploration, or viewpoint registration which 
integrates information from one moving platform or multiple platforms flying together into three- 
dimensional models; and 

(4) content-based [12] or object searching, map updating, or template registration which looks for 
the correspondence between new sensed data and a previously developed model or dataset. 

All these examples illustrate how geo-registration is a critical process, and how the amount of 
data which will have to be registered in the future will grow exponentially. Some of the examples 
also show that registration will need to be performed in real-time and often without human 
intervention. These two last remarks bring us to the issue of automatic geo-registration: the final 
goal of our work is to design fast and accurate methods for the automatic registration of multi- 
sensor remotely sensed data. The scope of this paper represents the first step towards this future 
goal, showing how wavelets can be integrated in the registration process, demonstrating it with 
examples from a few Earth remote sensing sensors, and implementing this algorithm on high- 
performance parallel computers. 


1.2 Why Automatic Geo- Registration 

Automatic image registration, which has been extensively studied in other areas of image 
processing, is still a complex problem in the framework of remote sensing. Often, the most 
common approach to registration is to extract a few outstanding characteristics of the data, which 
are called control points (CP's), tie-points , or reference points', then the CP's in both images are 
matched by pair and used to compute the coefficients of a bivariate polynomial, usually of degree 
three maximum. In some cases, the CP’s are recorded geographic features, also known as Ground 
Control Points (GCP's), but sometimes the CP's correspond to high-contrast data points (such as 
crossroads, bends in rivers). Most available systems follow this registration approach, often 
assuming some interactive choice of the CP's, and those systems are not well suited for the 
automatic processing of a large number of data, because polynomials require either a few very 
accurate control points or many inaccurate ones, but well-distributed over the image. In both cases, 
it is obvious that such a point selection represents a repetitive and labor-intensive task and becomes 
prohibitive for very large amounts of data. Also, this approach is unpractical when the registration 
has to be done in real-time, for example for navigation purposes. 
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Another issue of automatic geo-registration is accuracy. Since the interactive choice of control 
points in satellite images is sometimes difficult, too few points, inaccurate points, or ill-distributed 
points might be chosen, and as a result, large registration errors created. Such errors lead to data 
analysis errors as shown in two studies reported in [13,14]. The two studies demonstrate that a 
small error in registration may have a large impact on the accuracy of global change measurements 
and that, for example, a registration accuracy of less than 0.2 pixel is required to achieve change 
detection errors of less than 10%. 

As a summary, for reasons of speed, portability, and accuracy, automatic geo-registration is an 
important requirement to ease the work load, speed up processing, and improve the accuracy in 
locating a sufficient number of we II -distributed accurate tie-points. Such a technique can also be 
integrated in an hybrid automatic-manual cooperative registration which is preferred for some 
applications. 

In the remainder of this paper we briefly review previous automatic registration methods, then 
we describe how to utilize wavelets for registration purposes and the parallel implementation of 
such a wavelet-based algorithm. Finally, some results are presented and future work is described. 


2. REGISTRATION - A BRIEF SURVEY 

2.1 Which Distortions Must be Corrected 

The distortions described in [15] for Landsat-MSS data are general enough to account for most 
sensors directed at the Earth. Distortions are defined by the combined effects of sensor operation, 
the Earth's rotation, orbit and attitude anomalies and atmospheric and terrain effects. These 
distortions, which are noticeable when dealing with only one sensor, are even larger when 
considering multiple sensors carried by multiple platforms on different orbits. Refer to [15,16] for 
a good description of distortion sources and distortion categories. Distortions of airborne sensors 
are also described in [17]. 

In this paper, we will assume that distortions relative to sensor operation. Earth s rotation, and 
atmospheric effects have been corrected as part of a “systematic’ correction. As a first 
approximation, we will only be dealing with “precision” correction for orbit and attitude 
anomalies. This type of distortion mainly corresponds to an affine transformation. A few other 
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small continuous non-linear distortions due to altitude, velocity, yaw, pitch, and roll could be 
handled by global or local polynomial transformations of higher degree. 

2.2 Registration Steps and Issues 

We assume that any new incoming sensed image is being registered relative to a known 
reference image. We also assume that sensed image and input image are synonymous. According 
to Brown [1], image registration can be viewed as the combination of four components. 

(1) a feature space, i.e. the set of characteristics used to perform the matching and which are 
extracted from reference and input data; 

(2) a search space, i.e. the class of potential transformations that establish the correspondence 
between input data and reference data; 

(3) a search strategy, which is used to choose which transformations have to be computed and 
evaluated; and 

(4) a similarity metric, which evaluates the match between input data and transformed reference 
data for a given transformation chosen in the search space. 

The transformation which gives the best match according to the similarity measure is also called the 
deformation model or the mapping function. According to some a priori knowledge of the data, 
different search spaces may be chosen. For the reasons given in section 2.1, transformations that 
are often used are rigid transformations (composed of a scaling, a translation and a rotation), affine 
transformations (composed of a rigid transformation, a shear and an aspect-ratio change, a shear 
in the x-axis transforms the x-coordinate into a linear combination of both x and y-coordinates, and 
the aspect-ratio is defined as the numerical ratio of image width to height), and polynomial 
transformations. Correlation measurement is the usual similarity metric [18], although it is 
computationally expensive and noise sensitive when used on original gray level data; using a multi- 
resolution search strategy enables large reductions in computing time and increases the robustness 
of the algorithms. Other similarity metrics are described in [19], and a comparison of isometric 
(which preserves distances) versus polynomial transformations is given in [20]. Although high- 
order polynomials are superior to isometric transformations when the deformation includes more 
than a translation and a rotation, isometric transformations are more accurate when a smaller 
number of points is available and are less sensitive to noise and to largely inconsistent tie-points. 
Several feature extraction and feature matching methods have been integrated and compared in a 
system developed by Rignot et al [21], Most previous methods are applied on an image-to-image 
basis, i.e. band-to-band of multivariate data. 
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Another way to classify registration methods is adopted by Fonseca et al. [1 1], where twelve 
recent registration methods are described according to the type of their feature extraction. 
Automatic feature extraction is either area-based or feature-based, the feature-based process 
occurring either in the spatial domain or in the transform domain. 

In this section, we will take another viewpoint to describe a majority of the registration methods 
found in the literature in the past thirty years, and which have the potential to be applied to remote 
sensing images. As previously described, a usual semi-manual registration first manually 
determines corresponding control points before computing the deformation model. Automatic 
methods can be classified into two types, those which follow a human approach, with a point-to- 
point matching, and those which take a global matching approach. In section 3, we describe a two- 
step approach in which the first step does not rely on point to point matching of control points to 
compute a first estimate of the mapping function. 

Among the methods which perform point-to-point matching [22-41], the most common control 
points are the centers of gravity of regions (Cracknell [24], Flusser [27], Goshtasby [29,30], Li et al 
[38], Ton [23], Ventura [34]) with or without region attributes such as areas, perimeters, ellipticity 
criteria, affine-invariant moments [40], and inter-regions distances. More recently, several authors 
(Corvi [25], Djamdji [26,41], Li&Zhou [32,39], Chellappa et al [35,36]) have been using features 
extracted from a wavelet decomposition. Some examples are maxima and minima of wavelet 
coefficients, high-interest points or local curvature discontinuities. A few methods [17] utilize 
Delaunay triangulation methods to progressively increase the number of accurate control points. 
The main difficulty associated with point-to-point matching occurs with missing or spurious points, 
which makes the matching process more difficult and less reliable [9,20,22,23]. 

Among the global methods [9,19,42-54], the transformation is either found by correlation or by 
optimization, in the spatial or in the frequency domain. When in the spatial domain, correlation or 
optimization is performed either in the original data or on edge gradient data. Another method 
proposed by Stockman [49] involves the computation of a “Hough accumulator” for all the 
transformations matching edge segments or vectors linking feature points. Some recent research 
(Thevenaz and Unser [42,50]) has also focused on the use of wavelets for global image registration. 

Other more recent methods are also described in [55]. We qualitatively compare our work to all 
the registration methods based on wavelet transforms in section 3. Section 4 describes how the first 
step of this approach has been implemented on parallel computers, while section 5 shows results of 
our technique applied to several remote sensing image data. 
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3. REGISTRATION USING WAVELETS 


Our goal is to define fast, accurate methods for the automatic registration of remotely sensed 
data. The accuracy of these methods will be defined by the choice of the search strategy 
(component 3 in the registration definition), which could itself be a component of a general 
planning strategy [56] choosing a trade-off between accuracy and speed of the proposed methods 
according to the user's needs. 

To solve the problem of missing or spurious points in the matching process, we propose to 
solve the registration problem in two steps: 

Step 1 The mapping function is computed globally over the images, without individual matching 
of control points. This step provides a good first estimate of the mapping function, 
chosen in this work as a composition of rotations and translations. 

Step 2 Many strong and well-distributed features are extracted throughout the images and the 
previous transformation is utilized to perform the matching. Sub-pixel accuracy can be 
achieved if the number of matched pairs is much larger than the minimum number 
required to compute the final deformation model. 

In this paper, we will focus on step 1 with an emphasis on the computational speed and on the 
ability of the developed techniques to handle multi-sensor data; these two requirements brought us 
to the utilization of multi-resolution wavelet transforms. Briefly, a wavelet decomposition of any 
given signal (1-D or 2-D) is the process which provides a complete representation of the signal 
according to a well-chosen division of the time-frequency (1-D) or space-frequency (2-D) plane. 
Through iterative filtering by low-and high-pass filters, it provides information about low- and 
high-frequencies of the signal at successive spatial scales. See references [57-61] for more detail on 
wavelet transforms. The choice of using wavelets is justified by the following reasons: 

(1) Multi-resolution wavelets, largely used for data compression and browsing, are used as a first 
step to bring the multiple types of data to the same resolution without losing significant information 
and without blurring the higher resolution data. Multi-resolution wavelet decomposition preserves 
most of all important features of the original data even at a lower resolution, especially global scale 
features such as rivers, roads, and lakes. 

(2) Further multi-resolution wavelet decomposition highlights strong image features at the lower 
resolution thus eliminating weak higher resolution features. 


- 8 - 



(3) The multi-resolution iterative search focuses progressively towards the final transformation with 
a search interval decreasing and an accuracy increasing at each iteration. This algorithm achieves 
higher accuracies with higher speeds than a full search at full resolution. 

(4) Multi-resolution wavelet transforms can be implemented very easily on a parallel computer. 


3.1 Choice of Wavelet Coefficients 


Different choices of wavelet transforms can be made; we chose to use orthonormal wavelets. 
The main advantage of orthonormal wavelets is their computational speed, but their main 
disadvantage is their lack of translation-invariance, which means that the wavelet transform does not 
commute with the translation operator. According to the Nyquist criterion, in order to distinguish 
between all frequency components and to avoid aliasing, the signal must be sampled at least twice 
the frequency of the highest frequency component; therefore when using a separable orthogonal 
wavelet transform, information about the signal may change within or across subbands [64]. In 
order to assess the useability of orthonormal wavelets for image registration, we conducted a study 
where the use of wavelet subbands was quantitatively assessed as a function of features sizes. This 
study reported in Stone et al [62] shows that using cross-correlation and orthogonal wavelet filters 
is still a useful registration scheme in spite of translation effects. The results are summarized here, 
see [62] for more details: 

• the low-pass subband is relatively insensitive to translation, provided that the features of interest 
have an extent at least twice the size of the wavelet filters. 

• the high-pass subband is more sensitive to translation, but the peak correlations are still high 
enough to provide an accurate registration. 

From the experiments reported in this paper, we also observed that a local lack of translation 
invariance is compensated by taking a global approach instead of a point-to-point matching, and by 
the combination of information from several image subbands. To further investigate this issue, 
other translation-invariant wavelet transforms [63-65,75] are now being studied. 

According to Mallat [61], an orthonormal basis of wavelets can be defined by a scaling function 
and its corresponding conjugate filter L. In this case, the wavelet decomposition of an image is 
similar to a quadrature mirror filters decomposition with the low-pass filter L and its mirror high- 
pass filter H; this decomposition is summarized in Figure 1. The filters L and H are one- 
dimensional of varying size depending on the application (see [59] for more detail). For registration 
purposes, we have tested all filter sizes between 2 and 20 and have found 4, 6, or 8 to be the most 
useful filter sizes. Below size 4, the size 2-Haar filter might create step edges in the filtered data, 
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while above size 8, edge blurring occurs. We will call LL, LH, HL and HH the four sub-images (or 
subbands) created at each level of decomposition. 

In order to choose which wavelet coefficients to use for registration purposes, we first applied 
this wavelet decomposition to the image of a human face, where strong features are easy to extract 
visually. Results, reported in (Le Moigne [66,67], show that the strongest features appear in either 
one of the two bands LH or HL, with the Low/High (LH) band emphasizing horizontal features of 
the signal and the High/Low (HL) band emphasizing vertical characteristics. The High/High (HH) 
subband theoretically corresponds to diagonal features, but as can be seen in [66], the information 
contained in the HH subband also includes a large proportion of high-frequency noise. Further 
work on the use of the different subbands can also be found in our related work [62]. From these 
observations, we chose to utilize the maxima of the LH and HL bands as the feature set of our 
registration algorithm. 

3.2 Idea of the Algorithm 

According to the registration framework given by Brown[l], our algorithm can be described by 
the four following components: 

1. The feature space 

After performing the wavelet decomposition of both reference and input images, the histograms 
of LH and HL images are computed for all levels of decomposition. Then, only those points whose 
intensities belong to the top x% of the histograms are kept (x being a parameter of the program 
whose selection can be automatic); we call these points “maxima of the wavelet coefficients” (or 
“maxima”), and these maxima form the feature space. We will see in section 5.1 that when jc is 
automatically selected, its value varies between 13% and 15%. 

2. The search space 

The search space is composed of 2-D rotations and translations. Unless otherwise specified by 
the user, the system looks for rotations with angles included in the interval [0,90degrees] and for 
translations in the interval [0, half-the-maximum-dimension-in-the-reference-image]; more 
generally, the search can be reduced by utilizing a-priori information about the sensor movement 
and the satellite navigation system, as well as their corresponding accuracies. 

3. The search strategy 

The search strategy follows the multiresolution approach provided by the wavelet 
decomposition. At the highest level of decomposition, the search is exhaustive over the whole 
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search space but with an accuracy equal to A. The first approximation of the best rotation, R n , is 
chosen over this search space; then R n becomes the center of a new search interval of length 
2A, [R„- A , R n + A], and at the next lower level, the new approximated rotation, R n .,, is found within 
this search interval with an accuracy of A/2. This process is repeated until the first level of wavelet 
decomposition, where the search interval is [R 2 - A/2" *, R 2 + A/2" 2 ] and the final registration 
rotation, R,, is found with an accuracy equal to A/2" In particular, if 8 is the desired registration 
accuracy, A is chosen as 2" '8, where n is the number of levels of wavelet decomposition. Table 1 
summarizes this search strategy, and Table 2 summarizes the algorithm for four levels of 
decomposition, an image of size 512x512, a search for rotations, and an initial accuracy, A, equal to 

10 degrees. 

In some cases, for reasons of robustness, even if the final desired accuracy 8 does not require A 
to be small at the lowest resolution level, smaller accuracy steps can be considered for the initial 
step, and then the previous process described in Table 1 can be applied starting at level n-1, once the 
initial approximation of the transformation has been computed. This strategy avoids pursuing a 
false path in the search for the optimal transformation. Other authors [51] have proposed to 
simultaneously pursue several paths, but this requires more computations. 

Similarly, in the case of a composition of a rotation and a translation, the search is performed 
simultaneously on the three parameters, rotation angle, shift in x-direction and shift in y-direction. 
To reduce the amount of computations, another solution is shown in Table 3 where the search over 
three parameters is decomposed at intermediate levels into two searches in the complementary sub- 
spaces, sub-space of translations and sub-space of rotations. At each intermediate level of 
decomposition, the rotation is first assumed to be known and the translation is refined, then the 
translation is assumed to be known and the rotation is refined. Decomposing the main search into 
two “sub-searches” reduces dramatically the amount of computations, as shown in Table 3. A first 
approximation of the scale can be found by utilizing a Fourier-Mellin transform [46] or a log-polar 
transform [77] that transform the search for rotation or scale into a search for translation in the new 
transformed space. Once an approximation of the scale is found, a similar approach using sub- 
searches" can be taken when searching for a rigid or an affine transformation. 

4. The similarity metric 

The similarity metric of our registration algorithm is a cross-correlation measure. At each level, 
the best transformation is found by computing the successive correlations between wavelet 
coefficients of the input image and transformed wavelet coefficients of the reference image. The 
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transformation which corresponds to the highest correlation is chosen as the center of the next 
search interval. 

The registration algorithm is then described as follows: 

(A) Wavelet Decomposition of Reference Image. 

(B) Wavelet Decomposition of Input Image. Find maxima of LH and HL coefficients for each 
level of decomposition after histogram computation. 

(C) Starting from last level of decomposition and then iteratively for each level going up, find the 
best match between maxima of wavelet coefficients of input image and rotated and/or 
translated maxima of the wavelet coefficients of the reference image. The best transformation 
is then refined iteratively and with increasing accuracy, as previously described. 

Compared to an edge-based correlation performed on the original data which would require 360 
floating point operations per pixel, our wavelet-based registration only requires 1 10 floating point 
operations per pixel. 

Unlike our method, most wavelet-based registration approaches (as described in section 2.2) 
utilize a control-point matching approach, except for the methods described by Thevenaz and Unser 
[42,50] and by Olivo [48], which differ either by the type of wavelets, the type of features and/or the 
similarity metric, as well as by the application domain. A feature space similar to ours is used by 
Corvi [25] and Djamdji [26], although Corvi also uses wavelet minima. Djamdji's approach also 
differs by the type of wavelet decomposition, which is an "algorithme a trous" with a non- 
decimating approach. Li and Zhou [32,39] compute multi-resolution edges and high-interest points 
on these contours to perform the matching. Zheng and Chellappa [36] utilize a multi-resolution 
Gabor wavelet, the maxima of an energy measure to find local curvature discontinuities and a local 
correlation on windows around these maxima to perform the point-to-point matching. The latter two 
methods have been mainly tested on aerial and SAR imagery. 

We have tested our method on several scenes of remote sensing imagery (as well as human face 
images, see [66]). Our approach is summarized by the following characteristics: 

- the matching is performed globally instead of point by point which, in the absence of a-priori 
knowledge, decreases the risk of computing a match based on too few inaccurately paired points, 

- the search strategy takes advantage of the multi-resolution decomposition to speed up the process 
and to reduce the effect of noise, since lowering image resolution removes high-frequency noise; 

- the matching is computed on significant features (the maxima) instead of all pixels, which 
increases the control of the algorithm over noise, and reduces computation time; and 
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- the tuning of parameters is minimal. The only parameter, x, which controls the number of maxima 
chosen from the histograms, can be computed adaptively within the program, as will be explained in 
section 5.1. 


4. PARALLEL IMPLEMENTATION 

Since both the need for image registration and the amount of data to register are going to grow 
tremendously in the near future, the implementation of automatic image registration methods on 
high-performance computers needs to be investigated. For this reason, our algorithm has been 
implemented on a massively parallel computer, the MasPar MP-2, as well as on other parallel 
computers [68-70]. The MasPar Parallel Processor is a fine-grained, massively parallel computer 
with a Single Instruction Multiple Data (SIMD) architecture, consisting of 16,384 parallel 
processing elements arranged in a 128x128 matrix and connected by an eight nearest neighbors 
interconnection network. The reasons for using a MasPar were two-fold: first, the MasPar was very 
easily accessible to our research group, second a SIMD architecture is very appropriate to the 
computation of wavelet decomposition, which is essentially a pixel-parallel type processing. Other 
tests implementing a wavelet decomposition using a Cray T3D, an Intel Paragon, a Beowulf 
computer and a Convex SPP are reported in [68] and show that the MasPar provides the best 
speed-up, followed by the Cray T3D and then the Beowulf computer. The Beowulf, a Commercial- 
Off-The-Shelf (COTS) parallel computer, is a cluster of PC’s running parallel Linux. 

Wavelet decomposition, using a succession of convolution and decimation operations, is 
implemented in a straightforward fashion on a parallel architecture. Four consecutive pixels are 
stored into four layers of the same Processing Element (PE), convolutions are performed 
simultaneously at each PE and decimation is easily obtained by considering only half the number 
of layers at each iteration. For a 512x512 image, a filter size 4, and four decomposition levels, the 
computing time on the MasPar MP-2 is approximately 0.0154 seconds, compared to 4.87 seconds 
for the sequential timing which represents an improvement of about 300. Table 4 shows some other 
timing performances on the MasPar with different filter sizes. The results on the Beowulf (1.34 
seconds) are very comparable to those obtained on the Cray T3D (0.75 seconds), which is very 
encouraging for such a COTS-type architecture. See El-Ghazawi et al [68] and Chan et al [69] for 
more details on the parallel implementation of the wavelet decomposition. 

As described previously, for registration purposes, maxima of the LH and HL wavelet 
coefficients are found by computing the histograms of these two images, and choosing for each 
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image a threshold such that the thresholded images contain at most x% of the total number of 
points in the image. For the above test, histogramming and thresholding are computed in 0.38 
seconds on the MasPar. The new locations of the rotated pixels are computed in parallel on all the 
pixels. Each rotation takes about 0.02 seconds for an image up to 128x128 (since the MasPar array 
is of size 128x128), and 0.04 seconds for a 256x256 image. No complete systematic study of the 
implementation of the wavelet-based registration was performed on the MasPar, but further studies 
on the implementation of this registration on other parallel computers are reported in [70]. Table 5 
shows some of these results. On the Beowulf architecture, the algorithm requires about 0.545 
seconds when 16 processors are utilized, while it takes 0.196 seconds on the Cray T3E and 0.52 
seconds on the Cray T3D, for the same number of processors. In general, Beowulf shows similar 
or better performances than the Cray T3D. Furthermore, although Cray T3D and Cray T3E show 
better scalability than Beowulf, Beowulf still exhibits a speed-up of 8 for 16 processors, which 
shows a good ratio performance versus cost compared to other processors. See [70] for more 
details on these results. 

5. RESULTS 

Results are shown using Landsat Thematic Mapper (see Figure 2), and AVHRR (see Figures 5 
and 6) data. For all the examples below, nearest neighbor was used as the common resampling 
function. 


5.1 Landsat/TM 

Figure 2a is a 512X512 image extracted from Band 2 of a Landsat-TM scene of the Pacific 
Northwest. Figure 2c shows Figure 2a rotated 18 degrees; this rotation has been computed 
independently to test the results given by our algorithm. Figures 2b and 2d show the four levels of 
wavelet decomposition of Figures 2a and 2c, respectively, using a Daubechies filter size 4, and four 
levels of decomposition. Figure 2a (straight) is taken as the reference image, and Figure 2c (rotated) 
is considered as the sensed image. Figure 3 shows at each level the maxima which form the feature 
space, and the rotations of these maxima for the reference image. We chose to perform the 
registration with a final accuracy of 1 degree (5=1) which, according to Table 1, imposes the initial 
search space to be [0,88] and the initial accuracy (or increment) to be 8 degrees (A=8). Then the 
successive search spaces and accuracies follow the principle given in Table 2, according to the best 
rotations found at each step. The final correct rotation is retrieved as 18 degrees. 
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The best rotation at each level is chosen as the one maximizing the sum of the correlation 
between LH bands and the correlation between HL bands. The correlation results are illustrated in 
Figure 4; these graphs show how the two correlation functions LH and HL are quite different at the 
lowest level although they share the same maximum; it shows that at this level, computing the sum 
of the two correlation measures is very useful to choose a local maximum. Then, as the search 
focuses more and more around the final transformation, the two curves become more and more 
similar, and are almost identical at the highest level of decomposition. These results suggest that as 
soon as the search space has been narrowed around the final transformation, only one of the two 
functions LH or HL could be utilized, thus reducing by half the amount of computation. 

Another issue to consider is the determination of the parameter x which defines the number of 
maxima in each image. As shown in Table 6, this parameter can be computed adaptively by the 
program; at the lowest level of resolution, the best transformation is computed successively for the 
four values 5%, 10%, 15%, and 20% of x. A correlation measure is associated to each of these four 
computations. Then the parameter x is chosen as the sum of these four thresholds weighted by their 
corresponding correlations: 

x% = 5% * Correl 5% + 10% * Correl l0% + 15% * Correl 15% + 20% * Correl 20% . 

This formula has been chosen after a few experiments that showed that below 5% too few features 
remain in the images, and above 20% too much noise is taken into account. Table 6 shows example 
details for the computation of the parameter x with different rotations and translations. For these 
examples, the adaptive value of x varies between 13% and 15%. Since noise is higher in higher 
resolution data, we have experimentally found it most effective to decrease x by an additional 2% 
for each higher level of decomposition. 

Other experiments were performed using Figure 2 as well as a test portrait image shown in 
reference [66]. Both images were artificially transformed by a series of rotations and translations: 
results are shown in Table 7. Compared to other simple correlation-based methods such as spatial 
correlation, phase correlation, and edge-based registration, the results show that wavelet-based 
registration performs as well or better than the other methods, and that the average rotation error 
over these examples is 0.42 degrees and the average translation error is 0. 17 pixels. 

5.2 AVHRR/LAC 


Figure 5 and Table 8 present results obtained with a 512x512 AVHRR-LAC image of the 
Pacific Northwest area, which has been independently rotated by 5 degrees and translated by 10 
columns and 6 rows. Figure 5 shows the original image. The wavelet decomposition was performed 
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using a Daubechies size 4 filter. At the lowest level of decomposition, the previous adaptive search 
is performed, leading to a threshold of 11% for level 3; then at each level, this threshold is 
automatically decreased by 2%. At the lowest level, for a better robustness, as described in section 
3.3, the search is performed with increments of 4 degrees in rotations and 4 pixels in translations. 
Then, from level 3 up, the search strategy described in Table 1 is applied with final accuracies of 1 
degree in rotations and 1 pixel in translations. After the search is completed, a rotation of 5 degrees 
and a translation of (5.00,3.00) pixels are found in the 256x256 image; this result is interpolated as 
a rotation of 5 degrees and a translation of (10.00,6.00) pixels in the full 512x512 image. 


Other results were obtained with an AVHRR dataset which represents a series of thirteen 512 
rows by 1024 columns AVHRR/LAC images over South Africa, navigated and georeferenced to a 
geographic grid using an orbital model developed at the University of Colorado [71] which 
assumes a mean attitude behavior (roll, pitch and yaw) derived using Ground Control Points [72]. 
A map of the coastline derived from the Digital Chart of the World (DCW) was generated for the 
same geographic grid, and was used to visualize the registration before and after our process. 
Figure 6 shows this dataset. Since no registration ground truth was available for these images, all 
data have been manually registered, assuming only a translation transformation. The results of the 
manual registration have been verified by superimposing the binary map of the coastlines onto the 
respectively shifted images (see Figure 7). Most of the results obtained by manual registration are 
verified as accurate by the coastline map. But for some of the data, especially very cloudy images, 
manual results do not match the coastline map and cannot really be considered as “ground truth 
but only as “references” for accuracy purposes. Figure 7 shows some of the results of our 
registration algorithm superimposed with the coastline. Overall the differences of registration 
between the manual registration and our wavelet-based algorithm average 1.43 pixels and 1.05 
pixels if the two most cloudy scenes, sal43 and sal46, are not included in this computation (see 

Table 9). 


Since misregistration errors occur mainly for cloudy scenes, we can use the knowledge of the 
coastline to create a mask and reduce the search for the correct transformation in an area along the 
coastline. In this last test, the consistency of the algorithm [78] is checked by looking in an area of 
[_40,+40] pixels in the vicinity of the coastline and by performing inter-registration of all pairs of 
scenes. From all these permutations of registrations, an average translation error is computed by 
considering the sum of the errors on all triplets of scenes. After these computations, the average 
translation error is of 0.425 pixels, which demonstrates sub-pixel consistency of our method. 


- 16 - 


More generally, on-going experiments are being performed to assess the accuracy of our 
algorithm as a function of translation and rotation amounts as well as a function of noise [79]. 

6. CONCLUSION 

We have presented a method for image registration based on wavelet decomposition and 
correlation, which shows promising results for fast registration of digital remotely sensed images. 
The characteristic features of the images are computed automatically through wavelets and the 
matching is done globally over the image, instead of locally for each pair of control points. The 
deformation model is assumed to be the composition of a rotation and a translation. For rigid and 
affine transformations, the approach can be generalized by using sub-searches or by replacing the 
exhaustive search by an optimization or a statistically robust feature matching [73,74], One of the 
main advantages of this approach is its ease of implementation on high-performance parallel 
architectures, such as the massively parallel computer, MasPar MP-2, or a Beowulf cluster of 
workstations. 

The final step of the algorithm will utilize this first global transformation to locate and match 
automatically a few accurate reference points in the high resolution data and will refine locally the 
transformation. In a multi-resolution/multi-sensor framework, wavelet decomposition is first 
utilized to bring both sets of data to the same resolution, then it is used for registration purposes 
[76]. 

Future work will also include the study of other types of wavelets, the integration of cloud 
masks, when available, and a quantitative evaluation of our method relative to different test data as 
well as compared to other registration techniques; a first evaluation was performed [73] and will be 
continued. Automatic registration of remotely sensed data is a very complex problem, and as stated 
by other authors [11,21], we feel that only a future system that integrates multiple automated 
registration techniques will be able to address such a task for multiple types of remote sensing data. 
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Iterative Rotation Registration Using n Wavelet Decomposition Levels 
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Registration Algorithm - Search for Rotations, 
for a 512x512 image, 4 levels of decomposition, accuracy=l degree 
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Table 3 

Registration Algorithm • Search for Composition of Translations and Rotations, 
for a 512x512 image, 4 levels of decomposition, accuracies = 8© degrees and ST pixels. 
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Table 4 

Timings for the Wavelet Decomposition of a 512x512 Image; 
Parallel Implementation on the MasPar MP-2 


PE # 

Cray T3E 

Cray T3D 

Beowulf 

1 

2.382 

6.709 

4.664 

2 

1.303 

4.284 

2.390 

4 

0.604 

2.022 

1.279 

8 

0.330 

0.896 

0.739 

16 

0.196 

0.520 

0.545 


Table 5 

Timings for the Wavelet Registration of a 512x512 Image; 

Parallel Implementations on Cray T3D, Cray T3E and Beowulf high-performance architectures 
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Figure 2 

(a) Original Landsat-T hematic Mapper Image 
(Pacific Northwest ) 

(b) Wavelet Coefficients Corresponding to Figure la 

(c) Figure la Rotated by 18 Degrees 

(d) Wavelet Coefficients Corresponding to Figure lc 
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LEVEL 4 - Search for Rotations in [0,88 degrees] 




Rotations at Reference HL *3 


Input HL 

!: -V r 


■ 

^ O' 5 V“ 'Lr;j . 

V" '&'v 

r> v "V 

Best Rotation at Level 4 : 16 degrees 


LEVEL 3 - Search for Rotations in [(,24 degrees] 



— i- Best Rotation at Level 3 : 18 degrees 

Figure 3a 

Maxima of Wavelet coefficients for Level 4 (32x32) and Level 3 (size 64x64) 


Rotations of Reference LH s : Level 2 



Figure 3b 

Maxima of Wavelet coefficients - Level 2 (size 128x128) 
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Rotations of Reference LH's - Level I 



Best Rotation at Level 1 : 18 Degrees 

Figure 3c - Maxima of Wavelet coefficients for Level 1 (size 256x256) for Reference Images 
Rotated 17,18,19,20, and 21 degrees and Compared t o the Input Image 
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0 8 16 24 32 40 48 56 64 72 80 88 


Figure 4a 

Correlation Functions - Level 4 



13 15 17 19 21 23 


Figure 4c 

Correlation Functions - Level 2 



Figure 4b 

Correlation Functions - Level 3 



Figure 4d 

Correlation Functions - Level 1 


33 - 




Rotation 

(degrees) 

Translation 

(TX,TY) 


Correlation 


0 

(100,0) 

5% 





10% 


— i 



15% 


■■■ 



20% 


mSM 

0 

(0,50) 

5% 

HSB1 




10% 





15% 

0.59 




20% 

0.53 

0.13 

0 

(20,60) 

5% 

0.13 




10% 

0.29 




15% 

0.39 




20% 

0.34 


5 

(100,0) 

5% 

0.19 




10%; 

0.21 




15% 

0.27 




20% 

0.29 

0.13 

5 

(0,50) 

5% 

0.37 




10% 

0.48 




15% 

0.51 




20% 

0.38 

0.13 

5 

(20,60) 

5% 

0.07 




10% 

0.19 




15%> 

0.3 




20%' 

0.35 

0.15 


Table 6 

Adaptive Choice of Wavelet Histogram Threshold 


































DATA 

GIRL 

TM 

True Rotation 

5 

55 

5 

5 


re— 

re 

3 

3 

3 

mm 

■fl 

True Translation 




(6,4) 



(0.50) 

(0,0) 


(5,2) 


(5.2) 

SPATIAL TOftfeEtATTON 
(Only) Translation (pixels) 

(20,61) 

. 


- 

- 

- 

(50,0) 

(5,2) 

PHASE CORRELATION 
(Only) Translation (pixels) 

(20,60) 




- 


(50.0) 

(5,2) 














Rotation (degrees) 

5 

55 

5 

5 

3 

18 

18 

4 

4 

4 

-l 

0 

Translation (pixels) 

(0.0) 

(0,0) 

(20,60) 

(6,4) 

(19.61) 

(0,1) 

(0.51) 

(0,0) 

(50.0) 

(5.2) 

(49,0) 

(5,2) 

VAVELET REGISTRATION 













Rotation (degrees) 

5 

55 

5 

5 

0 

18 

23 

4 

4 

4 

0 

0 

Translation (pixels) 

(0.0) 

(0.0) 

(20,60) 

(6,4) 

(20,60) 

(0,0) 

(0.50) 

(0.0) 

(50.0) 

(4.2) 

(50.0) 

(6,2) 

iVAVE-REG: Error Rotatioi 

0 

0 

0 

0 

0 

0 

5 

0 

0 

0 

0 

0 

Error-Translation 

0 

0 

0 

0 

0 

0 

0 

0 

0 

1 

0 

1 


Table 7 

Results of our Wavelet-Based Registration Compared to Other Methods and Applied to Two 
Test Images (a Portrait, GIRL, see ref [66] and Figure 2a) 
for Multiple Rotations and Translations 
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Figure 5 

Original AVHRR Image (Pacific Northwest ) 
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F IN D BEST TRA NS FORMATION A T L EV EL 4, 

search theta in [ 0.00,10.00] and (TX,TY) in [ 0.00,16.00]X[ 0.00,16.00], 
with the increment: 4.00 


for the threshold = 
for the threshold = 
for the threshold = 
for the threshold = 


0.05 and the rotation = 
0.10 and the rotation = 
0. 1 5 and the rotation = 
0.20 and the rotation = 


0.00 Translation_optimal 
0.00 Translation_optimal 
0.00 Translation_optimal 
0.00 Translation_optimal 


( 

( 

( 

( 


for the threshold = 
for the threshold = 
for the threshold = 
for the threshold = 


0.05 and the rotation = 
0. 1 0 and the rotation = 
0.15 and the rotation = 
0.20 and the rotation = 


4.00 Translation_optimal 
4.00 Translation_optimal 
4.00 Translation_optimal 
4.00 Translation_optimal 


for the threshold = 
for the threshold = 
for the threshold = 
for the threshold = 


0.05 and the rotation = 
0.10 and the rotation = 
0.15 and the rotation = 
0.20 and the rotation = 


8.00 Translation_optimal 
8.00 Translation_optimal 
8.00 Translation_optimal 
8.00 Translation_optimai 


( 

( 

( 

( 

( 

( 

( 

( 


1.12, 

0.56) 

1.12, 

0.56) 

1.12, 

0.56) 

1.12, 

0.56) 

0.67, 

0.00) 

1.33, 

0.00) 

1.75, 

0.00) 

1.75, 

0.59) 

0.40, 

0.00) 

0.40, 

0.00) 

1.07, 

0.00) 

2.13, 

0.00) 


At level 4, Rotation_optimal = 0.00 and Translation_optimal = ( LOO , 0.00) 
and the Correlation is : 0.312404 


FIND BEST TRANSFORM ATLQJiAI LEVEL 3 

search theta in [-8.00, 8.00], (TX,TY) in [-6.00,10.00]X[-8.00, 8.00] 
with the increment: 4.00 and the threshold: 0.1 1 

At level 3, Rotation_optimal = 8.00 and Translation_optimal = ( 2.00 , 0.00) 
and the Correlation is : 0.202801 


inND-B£SXl.RAN^F-QEMAll.QIS-AI.LE y LL 2 

search theta in [ 4.00,12.00], (TX,TY) in [ 0.00, 8.00]X[-4.00, 4.00] 
with the increment: 2.00 and the threshold: 0.09 

At level 2 y Rotation_optimal = 6.00 and Translation_optimal = ( 2.00 , 2.00) 
and the Correlation is : 0.238743 

FIND BEST TRA N SFOR M ATION A T LEV E L 1 

search theta in [ 4.00, 8.00], (TX,TY) in [ 2.00, 6.00]X[ 2.00, 6.00] 
with the increment: 1 .00 and the threshold: 0.07 

At level 1, Rotation_optimal = 5.00 and Translation.optimal = ( 5.00 , 3.00) 
and the Correlation is : 0.648763 


INTERPOLATION TO FULL IMAGE: Rotation = 5.00 and (TX,TY) = (10.00, 6.00) 


Table 8 

Results of our Automatic Wavelet-Based Registration Applied to Figure 3 
Search for Compositions of Rotations and Translations 
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Figure 6 - Third Dataset - Coastline Map and Thirteen Images 
of a Multi-Temporal Series of A VHRR-LAC Band 2 Images over South Africa 


- 38 - 




Figure 7 

Zoom on Coastlines Transformed by Wavelet-Registration , 
Superimposed for Two of the AVHRR Images, u avhrr_sal26, avhrr_sal488” 
and Compared to the Manual Registration Results 
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1 AVHRR DATA 

sal 244 

sa 1 26 

sa 1 27 

sa 1 29 

sa 1 300 

sa 1 3 1 1 

sa 1 322 

sa 133 

sa 1 4 1 1 

sa 1 43 

sa 1 46 

sa 1 488 

i MANUAL REGISTRATION 

0 

(- 1 ,- 1 ) 

0 

(- 3 ,- 1 ) 

0 

( 2 , 0 ) 

0 

0 . 0 ) 

0 

( 0 ,- 1 ) 

0 

( 0 ,- 1 ) 

0 

( 0 ,- 1 ) 

0 

M-- 3 ) 

0 

0 

( 2 , 3 ) 

Rotation 

Translation 

0 

0 , 0 ) 

0 

( 0 , 0 ) 

! WAVELET REGISTRATION 











Rotation 

0 

0 

0 

-1 

0 

0 

0 

0 

0 

0 

-2 

0 

Translation 

( 0 , 0 ) 

( 0 , 0 ) 

( 0 ,- 2 ) 

(-6,-2) 

( 4 , 0 ) 

( 2 , 0 ) 

( 0 ,- 2 ) 

( 0 ,- 2 ) 

( 0 ,- 2 ) 

( 0 .- 4 ) 

(-6,-6) 

( 2 , 4 ) 


Tab,e 9 

Results of the Wavelet-Based Registration on the AVHRR Dataset Over South Africa 
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ABSTRACT 


With the increasing importance of multiple platform/mult, pie remote sensing mtsstons. fast and 
automatic integrate of digital data from disparate sources has become cm, cal to the success of 
these endeavors. Our work utilizes maxtma of wavelet coefficients to form the basic features of a 
correlat, on-based automattc regis.rate algorithm. Our wavelet-based reg, Stratton algomhm ,s 
tested successfully with data from the NOAA Advanced Very Htgh Resolution Radiometc, 
(AVHRR) and the Landsat/Thematic Mapper (TM), which differ by translation and/or rotation, y 
the choice of high-frequency wavelet features, this method is similar to an edge-based con-elation 
method, but by exploiting the multi-resolution nature of a wavelel decomposition, our method 
achieves higher computational speeds for comparable accuracies. This algorithm has been 
implemented on a Single Instruction Multiple Data (SIMD) massively parallel computer, the 
MasPar MP-2, as well as on the CrayTSD, Ihc Cray THE and a Beowulf cluster of Pentium 

workstations. 



